<html>
<head><meta charset="utf-8"><title>GEP inbounds · t-lang/wg-unsafe-code-guidelines · Zulip Chat Archive</title></head>
<h2>Stream: <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/index.html">t-lang/wg-unsafe-code-guidelines</a></h2>
<h3>Topic: <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP.20inbounds.html">GEP inbounds</a></h3>

<hr>

<base href="https://rust-lang.zulipchat.com">

<head><link href="https://rust-lang.github.io/zulip_archive/style.css" rel="stylesheet"></head>

<a name="165022422"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP%20inbounds/near/165022422" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Hanna Kruppe <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP.20inbounds.html#165022422">(May 06 2019 at 21:54)</a>:</h4>
<p><span class="user-mention" data-user-id="120791">@RalfJ</span> not to derail the <code>-&gt;</code> thread but re: your note that you'd like to make <code>inbounds</code> GEPs a rustc internal optimization rather than an inherent part of field offsets (and presumably more generally <code>offset</code>).</p>
<p>I also don't care for the vague and apparently useless "must be in bounds of an object" part, but <code>inbounds</code> unfortunately ties that together with additonal, rather different, quite useful (and not as onerous for unsafe code) properties: that the address calculation doesn't wrap around and that the offset computation (excepting the addition to the base pointer) doesn't wrap in <em>signed</em> arithmetic. I am already skeptical about how practical it is to teach LLVM (let alone rustc) to figure out the "in bounds of an allocation" aspect from <code>derefenceable</code> attributes on related pointers, and the non-wrapping properties require even more reasoning and can also depend on the concrete base address. And unlike the "in bounds of an allocation" property, the absence of wrapping does enables a substantial number of optimizations which I'd rather not make dependent on fragile analyses that have work against the language semantics.</p>
<p>So my long-term hope is to rather separate these aspects in LLVM IR and then free Rust users from having to worry about the "in bounds" part but keep the non-wrapping aspects. Which, of course, means keeping <code>-&gt;</code> unsafe, though with fewer proof obligations (and in the case of field offsets, the compiler gives substantial assistance -- creating a type that would lead to field offsets <code>&gt; isize::MAX</code> is a compile time error).</p>



<a name="165053018"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP%20inbounds/near/165053018" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> RalfJ <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP.20inbounds.html#165053018">(May 07 2019 at 08:41)</a>:</h4>
<p>Hm... I basically had this thought when people said on llvm-dev that GEPi vs GEP does not have a huge amount of impact. But I guess we'd have to collect some data.</p>



<a name="165053104"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP%20inbounds/near/165053104" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> RalfJ <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP.20inbounds.html#165053104">(May 07 2019 at 08:42)</a>:</h4>
<blockquote>
<p>I am already skeptical about how practical it is to teach LLVM (let alone rustc) to figure out the "in bounds of an allocation" aspect from derefenceable attributes on related pointers, </p>
</blockquote>
<p>I think it'd be rather easy for rustc to determine when it can add <code>inbounds</code> -- if we are working on a reference, we can; if we are working on a raw ptr, we cannot.</p>



<a name="165053113"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP%20inbounds/near/165053113" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> RalfJ <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP.20inbounds.html#165053113">(May 07 2019 at 08:42)</a>:</h4>
<p>The only complication I see there is around accessing the last field of an unsized struct</p>



<a name="165072779"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP%20inbounds/near/165072779" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Tom Phinney <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP.20inbounds.html#165072779">(May 07 2019 at 13:56)</a>:</h4>
<p>In the short term rustc could simply omit the <code>inbounds</code> on that last field. That would still provide safety for the other fields.</p>



<a name="165080003"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP%20inbounds/near/165080003" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Hanna Kruppe <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP.20inbounds.html#165080003">(May 07 2019 at 15:10)</a>:</h4>
<p>Yes, we should measure the impact, but doing this properly is difficult because absence of wrapping often affects rather situational and low-level optimizations which can nevertheless become important in hotspots.</p>
<p>For example, by a happy coincidence (seeing <a href="https://llvm.org/devmtg/2019-04/slides/TechTalk-Northover-A_tale_of_two_ABIs_ILP32_on_AArch64.pdf" target="_blank" title="https://llvm.org/devmtg/2019-04/slides/TechTalk-Northover-A_tale_of_two_ABIs_ILP32_on_AArch64.pdf">this talk</a>) I can point you at the arm64_32 target where GEPs without <code>inbounds</code> result in much worse machine code (have to mask off high bits after every GEP) than those with <code>inbounds</code>. You couldn't see that while testing on x86 or most other targets, but it's going to be a real pain if someone ever seriously targets this kind of platform with Rust.</p>
<p>Another major use of "no wrapping" I'm aware of is in the analysis of loops whose induction variable is a pointer (see <a href="https://github.com/llvm/llvm-project/blob/71e8c6f20fe4c5d9cfd6235c360d602f0d29a6ab/llvm/lib/Analysis/LoopAccessAnalysis.cpp#L1023-L1029" target="_blank" title="https://github.com/llvm/llvm-project/blob/71e8c6f20fe4c5d9cfd6235c360d602f0d29a6ab/llvm/lib/Analysis/LoopAccessAnalysis.cpp#L1023-L1029">here</a> for example). I'm not sure if you want to change anything about <code>ptr.offset(n)</code> or just built-in field accesses (changing <code>offset</code> is required for fixing the slicing problem that led to that llvm-dev thread), but if <code>offset</code> also loses the <code>inbounds</code>, then e.g. automatic vectorization of loops based on slice iterators could become more fragile.</p>



<a name="165108425"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP%20inbounds/near/165108425" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> RalfJ <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP.20inbounds.html#165108425">(May 07 2019 at 20:33)</a>:</h4>
<p>note that I am only suggesting this for offset-on-raw-ptr, not for all offsets. but I get your point.</p>



<a name="165108577"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP%20inbounds/near/165108577" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> RalfJ <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP.20inbounds.html#165108577">(May 07 2019 at 20:35)</a>:</h4>
<p>"nowrap" instead of "inbounds" would be nice if we used it for <em>all</em> projections (answering questions such as <a href="https://github.com/rust-lang/rust/issues/54857" target="_blank" title="https://github.com/rust-lang/rust/issues/54857">https://github.com/rust-lang/rust/issues/54857</a>). but when it comes to raw ptr projections, as far as spec and Miri complexity goes, both are equally annoying (as on, both need a special case and put extra burden on unsafe code authors in code that already is very hard to get right)</p>



<a name="165156806"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP%20inbounds/near/165156806" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> RalfJ <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP.20inbounds.html#165156806">(May 08 2019 at 12:08)</a>:</h4>
<blockquote>
<p>that the offset computation (excepting the addition to the base pointer) doesn't wrap in signed arithmetic</p>
</blockquote>
<p>just to be sure I remember correctly, this really was with the base being interpreted unsigned but the offset signed, right?</p>



<a name="165158114"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP%20inbounds/near/165158114" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Hanna Kruppe <a href="https://rust-lang.github.io/zulip_archive/stream/136281-t-lang/wg-unsafe-code-guidelines/topic/GEP.20inbounds.html#165158114">(May 08 2019 at 12:30)</a>:</h4>
<p>Yes. This is the part that actually makes allocations larger than isize::MAX bytes problematic.</p>



<hr><p>Last updated: Aug 07 2021 at 22:04 UTC</p>
</html>